Depth estimation from single monocular images is a key component of scene understanding and has recently benefited greatly from deep convolutional neural networks (CNNs). In this article, we take advantage of recent deep residual networks and propose a simple yet effective approach to this problem. We formulate depth estimation as a pixel-wise classification task. Specifically, we first discretize the continuous depth values into multiple bins and label the bins according to their depth range. We then train fully convolutional deep residual networks to predict the depth label of each pixel. Performing discrete depth label classification instead of continuous depth value regression allows us to predict a confidence in the form of a probability distribution. We further apply fully-connected conditional random fields (CRFs) as a post-processing step to enforce local smoothness interactions, which improves the results. We evaluate our approach on both indoor and outdoor datasets and achieve state-of-the-art performance.
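The discretization and confidence steps described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not say how the bins are spaced, so the log-uniform spacing, the depth range, the bin count, and every function name below are assumptions chosen for illustration only.

```python
import numpy as np

def make_bin_edges(d_min, d_max, num_bins):
    # ASSUMPTION: bins are uniform in log-depth; the abstract does not
    # specify the spacing. Returns num_bins + 1 increasing edges.
    return np.exp(np.linspace(np.log(d_min), np.log(d_max), num_bins + 1))

def depth_to_label(depth, edges):
    # Map each continuous depth value to the index of the bin containing it.
    # Using only the interior edges clamps out-of-range depths to the
    # first or last bin.
    return np.digitize(depth, edges[1:-1])

def label_to_depth(labels, edges):
    # Represent each bin by its geometric centre (hypothetical choice).
    centers = np.sqrt(edges[:-1] * edges[1:])
    return centers[labels]

def softmax(logits, axis=-1):
    # Numerically stable softmax over the bin axis.
    shifted = logits - logits.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)

def predict_with_confidence(logits, edges):
    # logits: (H, W, num_bins) per-pixel scores, e.g. from a network.
    probs = softmax(logits, axis=-1)
    labels = probs.argmax(axis=-1)    # most likely depth bin per pixel
    confidence = probs.max(axis=-1)   # probability of that bin
    return label_to_depth(labels, edges), confidence

# Usage with random scores standing in for network output.
rng = np.random.default_rng(0)
edges = make_bin_edges(d_min=0.5, d_max=10.0, num_bins=50)
depth_map, confidence = predict_with_confidence(
    rng.normal(size=(4, 6, 50)), edges
)
```

Because every pixel gets a full distribution over bins rather than a single regressed value, the maximum probability (or the entropy of the distribution) is available as a per-pixel confidence signal.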